Campaign Manager - Best Practice Guides
Best Practice - iLoader - Strategies For Loading Data

Introduction

This guide describes iLoader best practice in a number of areas, including logging, indexing, data engineering, and update strategies. It does not replace the iLoader help file or the formal iLoader training course, but it provides useful insight into best practices in those areas.
Anyone intending to use iLoader should first attend a formal Alterian training course.

By way of a refresher for existing administrators, this section runs through a few behaviors that are now default for Engine. Existing scripts that have not been updated for a few releases may well be performing processes that are now redundant; removing them will reduce load time and therefore downtime for the system. Familiarity with these behaviors is key for ongoing load optimization.

Incremental Linking

Functionality implemented for Incremental Linking in Engine 5.1 means that if you Append or Delete data from your database, you no longer have to drop all the links before the Append/Delete or re-create them afterwards. Engine will re-calculate only those links that are required for the new data. It is highly recommended that scripts are edited to remove drop link statements in order to benefit from the performance improvements of this change. Existing scripts that include a drop link statement will continue to work; however, improved performance can be achieved by not dropping links unnecessarily.

External Indexing

External Indexing is now switched on by default. This is the process of performing link processing within a NucleusSurrogate.exe process rather than the standard NucEngine.exe. As links are CPU-intensive, having them executed in external surrogate processes removes some of the processing load from NucEngine.exe. It should be noted that this is an area where an under-spec'd machine could cause CPU contention, so this should be carefully monitored.

Exclusivity

Exclusivity is no longer controllable by the user in Engine 7.0.0. It is handled automatically by Engine via the new locking mechanism introduced in Engine 7.0.0.
Enhanced Security

This setting is switched on by default, and can be switched off if required.

Understanding the object dependency model from Engine 7.x with Campaign Manager 5.1

Prior to Engine 7.0, the relationship between database objects was based on their NameEx, i.e. you would create an 'Age Band' decode column that uses the NameEx [Demo].[Customer].[Age] to find the source column. To ensure data integrity as part of the locking improvements introduced in Engine 7.0 with CM 5.1, the relationship between database objects is now defined by a unique, automatically allocated ID, and not their NameEx. Additionally, all engineered objects now track the specific link path used when they were created. This means that if you refer to an object in an engineered field, and then delete the object and recreate it with the same name, the reference will be lost, as the object will have a new unique identifier and the expression will be invalidated. The same invalidation will also occur if you delete a link that the expression is traversing to get to its object.

Example 1

Pre-Engine 7.0.0 behavior
Engine 7.x behavior
Example 2

Pre-Engine 7.0.0 behavior
Engine 7.x behavior
Renaming Tables and Columns

This is possible via iLoader, but you must be aware of the dependency model. Referring to Example 1 above, if you change a column called [Age] to [AgeOld] and create a new column called [Age], then any related objects will still be linked to [AgeOld] via the object's unique identifier.

Getting Started

Starting iLoader

It is recommended that iLoader is initially started via AMC and not via any default Start menu items, as these are not Engine project aware.

Logging and Audit

At the start of any new load process, it is recommended that a new Cerlog is started. This prevents confusion and makes it easier to identify issues. Use the START_NEW_ENGINE_LOG command to do this. If possible, we recommend adding logging information to your script to be used for data audit purposes. There are two ways of doing this. The PUBLISH command creates a text or tab based Data Audit; refer to the iLoader Help File for further details. The DATA_AUDIT command produces a file containing information about the database. For performance reasons, you can now exclude the statistical portion of the audit file by adding the parameter NOSTATS:

DATA_AUDIT NOSTATS (no statistics shown in audit)
DATA_AUDIT STATS (statistics will be shown)

Stats are switched on by default, and old DATA_AUDIT commands with no switch will default to showing stats.

Exclusive Maintenance Mode

When loading data into Engine or backing up the repository, it is a key requirement that the Project state is maintained for the duration of the load and that Engine does not start up, either automatically or because a user attempts to log into a client application. In an enterprise environment with multiple client applications and geographically distributed users, this becomes a difficult process for administrators to manage. Exclusive Maintenance Mode functionality enables administrators to put the Engine Project into an exclusive 'single user' state while administrative tasks are completed.
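As an illustrative sketch of the logging recommendations above (the placement of the commands within a script is an assumption; only START_NEW_ENGINE_LOG and DATA_AUDIT are taken from this guide):

```text
;; Start a fresh Cerlog so issues in this run are easy to isolate
START_NEW_ENGINE_LOG

;; ... load commands ...

;; Produce a data audit; NOSTATS omits the statistical section for speed
DATA_AUDIT NOSTATS
```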
This functionality includes several iLoader commands and NucBroker features.

How it Works
;; Check the initial Project state, exit scripts if server crashed (PreLoad script file)
IF_PROJECT_IN_RECOVERY_STOP
;; Set the access token; this is optional, as a default value exists.
;; Do not alter the token value if the Project has been changed to inactive or restricted states!
SET_ACCESS_TOKEN secret_token
;; Optional clean up. This used to be done in the control file (with the optional ENABLE_CLEANUP=True).
;; It has now been moved to separate and controllable commands
SET_PROJECT_MODE_ACTIVE
REMOVE_TEMP_FILES
REMOVE_BAD_LINKS
CLEANUP_DATABASE
;; Now disallow any restarts of Engine. This does not stop the server if it is running
SET_PROJECT_MODE_INACTIVE
;; Wait up to 1 minute for the server to shut down on its own (if it is running)
WAIT_FOR_SERVER_STOP 60
;; If it is still running, shut down by disconnecting COM
;; NOTE: the server cannot restart as the Project is now INACTIVE
IF_NOT_SERVER_STOPPED
  RECYCLEENGINE 1,1
  WAIT_FOR_SERVER_STOP 120
  IF_NOT_SERVER_STOPPED
    STOP
  ENDIF
ENDIF
;; The server is stopped and WILL NOT restart; alter the file system in safety
;; Switch to restricted LOAD mode; the secret token prevents other users from starting / using Engine
SET_PROJECT_MODE_RESTRICTED
;; The server is in restricted mode for normal loader commands
;; Loading complete. Optionally, force an Engine stop
WAIT_FOR_SERVER_STOP 120
IF_NOT_SERVER_STOPPED
  RECYCLEENGINE 1,1,false
  WAIT_FOR_SERVER_STOP 120
  IF_NOT_SERVER_STOPPED
    SET_PROJECT_MODE_INACTIVE
    STOP
  ENDIF
ENDIF
;; Make the project available to all (PostLoad script file)
SET_PROJECT_MODE_ACTIVE

Database Column Definitions

As part of database design, it is important to know information about the data to be loaded and to optimize the data loading options, especially around indexing. In the past, administrators of Engine databases have not given this area the attention it deserves and, to some extent, have not had the tools to do so. As data volumes grow and clients' requirements for minimizing downtime increase, the time taken for the load process is increasingly being scrutinized to achieve the goals of:
The following sections are aimed at Engine Database Administrators, both experienced and new, and act as a resource for making the key decisions to achieve an optimal balance of load time and performance. Administrators can positively impact storage, performance, and load downtime by taking the time to learn the options and apply the correct one.

When we consider a column to be loaded, we will initially define a Data Type, i.e. UNICODE, TEXT, INTEGER, REAL, etc., but other properties are also important, such as cardinality (the number of discrete values), maximum width, and the scope of usage of that column within the database.

Cardinality

What is the maximum number of discrete values the column is likely to have over its life cycle? This defines the underlying Engine storage and is absolutely key. Cardinality is initially classified as:
Scope of Usage

How a column is used will directly affect the decision to index, which is covered later in the index strategy section, but administrators will need to know:
Width

The defined width of a column determines its disk footprint and, under certain circumstances, its storage when considering MEMO fields. It also governs the width of a fixed width export.
Index Strategy

Overview

As discussed above, part of database design is to know information about the data to be loaded for each column in the database, and indexation is largely focused on how each column will most likely be used. These decisions are key to achieving optimal performance: reducing load times, and therefore system downtime, by only indexing those columns that need it, while ensuring that Engine has the indexes available to give the fastest response times to queries and that data is stored optimally for size on disk. More than ever, these decisions are important within an Engine repository. Carrying out unnecessary indexing will:
It is certainly true that in this area the administrator can impact storage and performance, and taking the time to learn the options and apply the correct one can reap rewards later down the line. As data is loaded, the default option is to load all fields indexed. This is achieved simply by declaring the column as TEXT, INTEGER, etc. To load a column un-indexed, simply add a "U" to this declaration, i.e. UTEXT, UINTEGER. The following should be considered when looking at column indexes on tables:
Dropping and Rebuilding Indexes

There is no longer any need to do this, especially when doing Appends.

Parallel Indexing

For some customers, indexing takes up a large proportion of the overall load process. If this is the case, then it may be useful to run multiple indexes in parallel. When considering this option, consider the following:
Assuming the answer to all of the above is yes, parallel indexing may offer significant time savings.

Implementation Options

There are two ways of achieving parallel indexing:
Furthermore, there are other decisions to be made depending on the infrastructure available:
The decision on whether to make use of a Support Server and implement distributed processing only arises if a support server is available and if it has good (by which we mean fast network) access to the Repository. Assuming it does, and assuming the Support Server is of comparable specification to the Engine server, it should prove more optimal to make use of the support server, as there will be less competition between the index processes for machine resources. If the decision has been taken to run multiple concurrent indexes, there are then two possible options:
How many Parallel Indexes can I have?

There is unfortunately no straightforward answer to this; it depends on hardware and the individual environment. There will be a point at which adding additional instances has no benefit, and this point can only be found through trial and error in your own environment. By default, the Engine server is configured to run a maximum of two instances. This value can be modified by using the Engine External Process Configuration Applet in AMC. We recommend that this number is changed gradually and that you do not start by ramping it up significantly. Additional servers can also have Max Instances controlled on a server-by-server basis.

Switching on Parallel Indexing

This is turned on by default.

Engineering

To make use of parallel indexing when creating engineered fields for the first time on a brand new table, create the fields as UNINDEXED. It is possible to create EXPRESSION fields that are un-indexed by using UEXPRESSION instead of EXPRESSION:

UEXPRESSION destination_tbNameEx, fdName, {opt - Format}, {Expression Clause}

It is also possible to create un-indexed AGGREGATE fields:

UAGGREGATE, Full Table Name, Short Name for Field, Type of Aggregate, Full Target Field Name, {Full Alterian Engine SQL of filter to apply or blank}

Then use CREATEINDEXES to index the fields.

Partial Indexing

When appending to an existing table, where possible Engine will only index the new rows rather than re-indexing the entire table, which was necessary in older versions.

Data Types

Overview

Data Type selection is mostly a clear decision based on an understanding of the raw data, certainly for the basics of TEXT, INTEGER, BIGINT, REAL, DATE, TIME and DATETIME, but with this release there are further options for loading textual data.
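A minimal sketch of this pattern, based on the syntax templates above. The table and field names, the aggregate type, and the empty format placeholder are hypothetical, and whether CREATEINDEXES requires arguments is not shown here; consult the iLoader Help File before use:

```text
;; Create engineered fields un-indexed on the new table
UEXPRESSION [Demo].[Customer], AgeBand, , {Expression Clause}
UAGGREGATE, [Demo].[Customer], OrderCount, COUNT, [Demo].[Order].[OrderID],
;; Then build all outstanding indexes, allowing Engine to process them in parallel
CREATEINDEXES
```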
Strictly speaking, TEXT is used for single-byte data sets and UNICODE for double-byte data sets, but the new UNICODE storage type has characteristics that could be useful for single-byte data, and in some cases UNICODE could be the best choice. To understand this better, it is important to understand how Engine column storage works. Data Types have different storage based on their cardinality, and each storage has different characteristics, so knowledge in this area may also help drive the choice of Data Type. It is important to understand that Engine decides storage based on the user-defined Data Type, so it is the decision of Data Type that is important. There are three different storages in Engine, called DISCRETE, GENERIC and VARUNICODE, and the following table shows how these are implemented.
Having looked at the storage employed by Engine, the following table shows the key characteristics. It should be noted that these are general trends but, as with many configurable options, experimentation is recommended on particular data shapes.
Storage Summary
The four key pieces of information are:
When considering an Intra-Day update strategy, please see Campaign Manager - Best Practice Guides.

Data Update Strategies

This section is intended as a guide for Engine administrators planning an iLoader Load Strategy that makes use of Append and Update functionality to reduce the overhead of doing a full data refresh, with the aim of reducing load times. The recommendations and examples outlined below are based on a series of benchmark tests performed by Alterian using a combination of real customer data and manufactured data. As every Engine database is different in terms of data shape and schema design, use this document as a general guide for formulating a load strategy that is best suited to your own database. If you have any specific issues or problems, please contact your Alterian partner representative.
This document uses the following terminology when referring to the different methods of updating Engine data. Further detail on each function is included in the body of the document.

Append: The process of adding new records to an existing table; the records will be appended to the end of the table.

Update: The process of updating existing rows of data by making in-place column updates where the field value in the update file differs from the existing column value.

Full Refresh: The process of loading a new table, or reloading an existing table in its entirety. When performing a refresh, if a table already exists it will be dropped and reloaded.

Appends

Appends allow Engine administrators to add new records to their databases without having to do a time-consuming full refresh. Adding a small amount of data to a large table is more efficient in almost all cases. Append functionality is optimized for high-cardinality, non-memo, partitioned columns; data with these characteristics will see the highest performance gain. As every customer database is unique, a process of iterative testing and investigation is required to optimize your load strategy; there is no template solution that can be applied to all databases. Incremental appends have always been possible, but changes included in recent 5.x releases have made them more viable through the improved performance delivered by the 64-bit architecture, and a number of functional enhancements including:

Appends v Full Refresh

It makes sense that it is more efficient to add a small amount of data to a large table than to reload the entire table. We wanted to measure just how much more efficient this was, and to try to establish a general rule for when appends would become less efficient than a full data refresh.

Methodology

A full refresh was timed on three different sized databases, with a mix of field cardinality to approximate real customer databases:
On each database, we then timed appends of different sizes: 1%, 25% and 66%.

Results

As expected, the largest performance gain was achieved on the smallest append (1%):

990 million row load timed at 5h 16m
10 million row append timed at 0h 16m

However, even when doing a 25% append, a greater than 50% performance gain was recorded:

800 million row load timed at 4h 39m
200 million row append timed at 1h 58m

Extrapolating from our final test, the tipping point for our test database, where a full refresh would become more efficient than an append, would be in the region of 68%:

600 million row load timed at 3h 07m
400 million row append timed at 2h 59m

Updates and In-place Updates

For the Campaign Manager 6.0 release, Alterian has implemented a full in-place update process for all UPDATE processing, meaning the update is made to the data in situ; an update will never involve the creation of temporary tables and will never change the order of the rows in the table. This is a much more efficient process and is used by all UPDATE processing by default.

Rolling Table Updates

The classic scenario where rolling table updates are useful is a large transaction table with a monthly update cycle that drops the oldest month's data and adds the latest month's. This segmented data enables much more efficient deletion of sections of data. With partitioned tables, for example using a monthly partition method, if you are loading June 2011 and dropping June 2010, the 11 months in between are all stored in their own partitioned data files. The advantage of partitioned data files is in the delete speed: only data files that contain data that needs deleting will be compacted, rather than an un-partitioned data file where all data will be compacted. Rolling Transaction Update refers to a situation where the following are all true:
Process

Transaction partitioning is enabled with the following command:

EnableUpdateTransactions [Demo].[Order], ON

Once enabled, each subsequent append will create a new partition. (It is recommended that the EnableUpdateTransactions command be specified before each append.)

Partitioning

Partitioning is a deletion optimization and is only beneficial if the rows are already in the order in which they will be deleted. If this is not the case, then partitioning should not be used.

PARTITION Command

This command should be used once, when you have decided that the table is suitable. The PARTITION command groups the rows into separate physical files as identified by the specified row selection domain. The rows are not re-ordered or altered in any way.

Partition [Demo].[Order], {Select Distinct([Demo].[Order].[ALLMONTHS]) from [Demo].[Order]}

Partitioning an existing table using the PARTITION command is only of benefit if you are going to carry out regular rolling deletes. As explained above, the data MUST already be loaded in deletion order, i.e. by transaction date, and you will be deleting by transaction date.

Data Engineering on Changing Data

This subject is covered in detail in the Best Practice Guide Data Engineering in Campaign Manager and Engine. It is recommended that you read that document, as this release contains new functionality that will impact your data engineering strategy. A high-level summary is provided here, concentrating on data engineering performed by iLoader via the overnight load.

In Campaign Manager, Campaign History data is held in a set of three default Engine tables which are updated continuously throughout the day while Engine is running. Events relating to campaign activity (email opens, link clicks, etc.) are added to these tables, and users are able to report on them to get a real-time view of how their campaigns are performing. The three-table structure consists of:
Although the three Campaign History tables are system tables, it is possible to create engineered columns using them. Be aware that the data will change throughout the day, and take this into account when creating columns that may be used by users in reports or in campaign audience logic. Your users may suddenly ask, "Why are there suddenly 100,000 people in my campaign audience when there were only 5,000 this morning?" Engine 7.0.0 includes new column types and data transformation commands that address the requirement to update engineered columns during the day, or to create them as static snapshots that will not change until they are recreated during the overnight data load.

I always want my engineered columns to show the latest data

If you are running time-sensitive campaigns that respond immediately to recipient responses, then you will want columns engineered off your Campaign History tables to update with new response data as soon as it is available. If you fall into this category then you do not need to do anything; the default behavior is for columns to update as soon as the underlying data or a dependent column is updated. Make users aware which columns only change overnight, and which are based on Campaign History tables that are likely to change during the day.

I don't want my data to change

There are valid business reasons for creating columns that are snapshots of the data at a point in time. For example, you may create an aggregate column for reporting on Campaign History as part of the overnight data load, for a report that is generated and distributed once a day. The column will be dropped and re-created as part of the load process tomorrow night, so re-calculating it multiple times during the day serves no purpose and causes unnecessary processing. In this case, you would use the iLoader Engineer_As_Snapshot command:

ENGINEER_AS_SNAPSHOT TRUE/FALSE

e.g.
ENGINEER_AS_SNAPSHOT TRUE

Once set to TRUE, subsequent engineered columns will be created as snapshots. If this command is not present, engineered columns will be created with SNAPSHOT = False as default. The DROP_SNAPSHOTS command removes snapshot columns:

DROP_SNAPSHOTS Full Name of Database or Table

e.g.

DROP_SNAPSHOTS [Demo].[Customer]

The following table summarizes behavior when creating engineered columns via iLoader:
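Putting these commands together, an overnight load might manage snapshot columns along these lines. This is a sketch, not documented behavior: the placeholder engineering commands and the ordering of the steps are assumptions.

```text
;; Drop yesterday's snapshot columns before rebuilding
DROP_SNAPSHOTS [Demo].[Customer]
;; Create the following engineered columns as static snapshots
ENGINEER_AS_SNAPSHOT TRUE
;; ... engineering commands for once-a-day reporting columns ...
;; Revert to the default live-updating behavior for any further columns
ENGINEER_AS_SNAPSHOT FALSE
```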
Scheduling Loads

It is common practice for partners to schedule the load of data into an Engine repository. There are a number of options available:
All approaches are valid and used by Alterian customers.
Command Line Arguments

iLoaderClient can be run from the command line or a shortcut with arguments that affect its behavior. Arguments can be prefixed by a hyphen (-) or a forward slash (/):

/r
-r

Arguments have one of two forms:
Argument values can be enclosed in quotes if desired. If no qualifier is present, the argument value is assumed to extend to the start of the next argument or to the end of the command line, whichever comes first.

/s "XXXX"
/s XXXX

NVP delimiters can be specified using a space ( ) or a semi-colon (;):

/Project: ProjectName
-Project MyProjectName

Argument names are case-insensitive.
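For illustration, since prefixes are interchangeable and argument names are case-insensitive, the following invocations should be equivalent (the executable name and project value here are assumptions based on the examples above):

```text
iLoaderClient /Project:MyProjectName
iLoaderClient -project MyProjectName
```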
Calling Campaign Manager API Methods From iLoader

The following syntax is used to call Campaign Manager API methods from within iLoader:

CALLCMAPI <methodname>

e.g.

CALLCMAPI RunDocument

If you encounter issues using this syntax (e.g. 'msxml3.dll: The download of the specified resource has failed'), you can prefix the method name with an underscore character to use a different technique to call the method, e.g.

CALLCMAPI _RunDocument
© Alterian. All Rights Reserved. | Privacy Policy | Legal Notice